A SERIAL BINARY MULTIPLIER

The present invention relates to a serial binary multiplier for performing fixed 
point multiplication in data processing apparatus. 

Central processing units of data processing apparatus generally incorporate a 
multiplier unit for performing multiplication operations. Typically such multiplier 
units are based on well known array multiplier designs or a shift-and-add algorithm. 
Multiplier units of this kind are generally optimised for performance (i.e. processing 
power and speed) or for compact implementation. 

One example of a multiplier unit having compact size is described in our co- 
pending international patent application No. GB97/01520. 

However, the performance of a serial multiplier, unlike an array multiplier 
design, is dependent on the total transmission delay in performing a sequence of 
operations as the serial data is received. The total transmission delay is a combination 
of several delays in the sequential operation of the multiplication process, namely: a 
delay as the data is routed to the input of the multiplier; a delay as the data passes 
through the interconnect; and the multiplier operation delay. 

In monolithic design, circuit performance has improved many fold as
semiconductor processing techniques have led to smaller and smaller component
geometries. Contemporary integrated circuit process technology enables the
manufacture of deep sub-micron circuit elements with physical dimensions of less
than one micron. The performance of these circuits is often no longer determined by 
the operation of the active circuit components but is dominated by the interconnect 
delay between them. 

The difference between the performance of active components, for example 
transistors, and the interconnect, or routing, is greatly exaggerated in the 
implementation of programmable circuits such as Field Programmable Gate Arrays 
(FPGAs), where greater flexibility in the interconnect structures adds further to the 
delay imposed on signals passing through them. 

An alternative known approach to constructing a high performance multiplier 
is to base the design around a look-up-table. This is demonstrated in Altera's FLEX 
10K device. Using this technique all the possible results of the multiplication process
are stored in a table and the input operands are used to choose one result from the
table. The size of such multipliers becomes very large when, say, operands of 8 bits
or more are used. The multiplication of n-bit wide operands requires a table with 2^(2n)
entries. An improvement to this design is to use multiple smaller look-up tables
followed by a calculation step. This technique is also shown in Altera's FLEX 10K
device. The latter technique reduces the size of the multiplier but degrades the 
performance since a further calculation step is required after a preliminary result has 
been selected from the look-up table. 
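
By way of illustration only, the two look-up table approaches may be sketched in software as follows. The function names and the particular split into half-width tables are assumptions made for the sketch; it is not suggested that any particular device implements the technique in this way.

```python
def build_full_table(n):
    """Full look-up table for n-bit operands: one entry per (a, b) pair,
    i.e. 2**(2*n) entries in total."""
    return {(a, b): a * b for a in range(1 << n) for b in range(1 << n)}


def multiply_split_tables(a, b, n, half_table):
    """Multiply two n-bit operands (n even) using a table built for n/2-bit
    operands, followed by the further shift-and-add calculation step."""
    h = n // 2
    mask = (1 << h) - 1
    a_lo, a_hi = a & mask, a >> h
    b_lo, b_hi = b & mask, b >> h
    # Four preliminary results are selected from the small table ...
    hh = half_table[(a_hi, b_hi)]
    hl = half_table[(a_hi, b_lo)]
    lh = half_table[(a_lo, b_hi)]
    ll = half_table[(a_lo, b_lo)]
    # ... and combined in the additional calculation step.
    return (hh << (2 * h)) + ((hl + lh) << h) + ll


full = build_full_table(8)   # table for 8-bit operands
half = build_full_table(4)   # much smaller table for 4-bit operands
```

For 8-bit operands the full table holds 65536 entries whereas the half-width table holds 256, at the cost of the extra additions performed after the table look-ups.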

It is an object of the present invention to obviate or mitigate the aforesaid 
disadvantages and to improve the performance of the data processing function of 
multiplication. 

According to a first aspect of the present invention there is provided a serial 
binary multiplier for multiplying two binary operands to provide a final product, the 
multiplier comprising means for storing at least one first operand, a register for storing 
a partial product of the multiplication operation, means for receiving elements of a 
second operand serially, a calculation unit for calculating all possible results being the 
sum of the partial product and the product of the first operand with all possible values 
of the element of the second operand, said possible results being calculated during 
transmission of the second operand, means for selecting either one of the possible
results or the currently stored partial product on the basis of the value of the received 
element of the second operand, means for shifting the partial product in the register to 
provide a new partial product, and means to output the contents of the register as the 
final product when all bits of the second operand have been received. 

By using the calculation unit to calculate partial products whilst the second 
operand is transmitted the delay in transmitting the data is less significant in the 
overall time required to conduct the multiplication process. 

Preferably the second operand comprises a plurality of elements each 
comprising an m-bit word. In an embodiment where m=1 the calculation unit is an
adder.

The calculation unit calculates all possible results on the basis of the value of
the first operand and the value of previously received elements of the second operand. 

The means to output the contents of the register preferably provides the final 
result in serial form. 

In one preferred embodiment the first and second operands and the final 
product are in two's complement form and the possible results are calculated from the
first operand, the partial product and the previously received bit of the second 
operand. In such an embodiment the calculation unit is an adder and subtractor and 
may take the form of a single circuit capable of addition and subtraction, the operation 
being determined by the value of the previously received bit. 

According to a second aspect of the present invention there is provided a 
method of operating a serial binary multiplier for multiplying two binary operands to 
provide a product comprising the steps of storing a first operand, storing a partial 
product in a register, transmitting elements of a second operand serially whilst 
simultaneously calculating all possible results being the sum of the partial product and 
the product of the first operand with all possible values of the element of the second 
operand, selecting either one of the possible results or the currently stored partial 
product on the basis of the value of the received element of the second operand, 
shifting the partial product in the register to provide a new partial product, and 
outputting the contents of the register as the final product when all bits of the second 
operand have been received. 

Specific embodiments of the present invention will now be described, by way 
of example only, with reference to the accompanying drawings in which: 

Figure 1 is a block diagram of an embodiment of an m-bit binary serial 
multiplier according to a first embodiment of the present invention; 

Figure 2 is a block diagram of an embodiment of a 1-bit binary multiplier 
according to a second embodiment of the present invention; 

Figure 3 is a block diagram of the multiplier of figure 2 adapted for two's 
complement operation according to a third embodiment of the present invention; 

Figure 4 is a table showing the calculation process of the multiplier shown in
figure 3; and 

Figure 5 is a timing diagram for a single cycle of the multiplier operation. 

Referring now to the drawings, figure 1 shows the structure of an m-bit serial 
multiplier which performs the multiplication operation on a locally stored first 
operand B and a second operand A that is transmitted to the multiplier in the form of a
serial stream of m-bit wide data elements, the m bits of each data element being 
received in parallel and multiple serial data elements forming the complete operand 
data word. 

The multiplication process is performed by a calculation unit that comprises a 
bank of 2^m registers 1 and a bank of 2^m adders 2. The registers 1 store all possible
results of multiplying the first operand B with all possible (2^m) values of an element
of the second operand A. Each register 1 within the bank stores one result of multiplying
the first operand B with an assumed value of that element. Each of these 2^m
multiplication results is passed to one of the adders 2 in the bank of 2^m adders where it
is summed with a partial product of the overall multiplication process that is stored in 
a shift register 3. The results of the addition process are then passed to a multiplexer 4. 

A decoder (not shown) receives the m-bit serial input data element of the 
second operand A and on the basis of this, selects the appropriate correct result via the 
multiplexer 4. Thus the input data is used to select a pre-calculated result late in the 
calculation process. The selected (partial) result is then stored in the shift register 3 
which reformats the partial result by shifting the stored data by m-bits to the right. 
The partial result is then recirculated to the input of the bank of adders 2. The 
multiplication process described above is then repeated for the next received data 
element of the second operand A until the whole of the input data word of the second
operand A has been received and processed. If the input data represents the value zero
then the recirculated output of the shift register can simply be passed back to the register 3
rather than selecting the appropriate adder output. The final result in the shift register 
3 is transmitted to a parallel to m-bit serial converter (not shown) which outputs the 
final result (product) in the original m-bit serial format. 
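
By way of illustration only, the operation of the multiplier of figure 1 may be modelled in software as follows. The sketch assumes unsigned operands and assumes that the elements of the second operand A arrive least significant first (which is consistent with the right shift described above but is not stated explicitly). The function names, and the alignment of the products with the upper part of the shift register, are assumptions of the sketch rather than features of the described circuit.

```python
def serial_multiply(b, a_elements, m):
    """Model of the m-bit serial late-select multiplier.

    b          -- locally stored first operand B (non-negative integer)
    a_elements -- m-bit elements of operand A, least significant first (assumed)
    m          -- width in bits of each serial element
    """
    n = m * len(a_elements)                 # total width of operand A
    # Bank of registers 1: B multiplied by every possible element value.
    products = [b * v for v in range(1 << m)]
    partial = 0                             # contents of shift register 3
    for element in a_elements:
        # Bank of adders 2: every candidate is formed without using `element`,
        # so this work can overlap the transmission of the element.
        candidates = [partial + (p << n) for p in products]
        # Multiplexer 4: late select using the received element.
        selected = candidates[element]
        # Shift register 3: shift right by m bits to form the new partial product.
        partial = selected >> m
    return partial


def split_elements(a, m, count):
    """Split A into `count` m-bit elements, least significant first."""
    mask = (1 << m) - 1
    return [(a >> (m * i)) & mask for i in range(count)]
```

For example, `serial_multiply(13, split_elements(0xB7, 2, 4), 2)` processes an 8-bit operand A as four 2-bit elements and returns the same value as `13 * 0xB7`.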



The above described multiplier allows the parallel operation of both the 
multiplier operation (including addition of the products to the partial result in the shift 
register 3) and the input data transmission. Using a locally stored first operand B a 
number of possible multiplier results is pre-calculated independently of the second 
operand A and added to the partial result from the previous cycle. In this way the 
multiplication process delay and the data transmission delay occur simultaneously, or 
in parallel. The second operand A is only needed to complete the multiplication 
process by selecting one of the pre-calculated results. By employing a decoder that 
selects the appropriate partial result the delay generally associated with the 
multiplication process is reduced, whilst avoiding the need for a large look-up table of 
possible results. 

It will be appreciated that by using the locally stored first operand B in the 
preliminary multiplication process, the number of possible pre-calculated results is 
greatly reduced in comparison to conventional multipliers based on look-up table 
designs. 

Figure 2 shows an embodiment of the present invention that is used to 
multiply 1-bit serial input data. A 1-bit serial multiplier is highly suited to realisation
within a programmable device, since implementing programmable interconnects
between functional units that only require a single point-to-point connection is both 
practical and well known. 

The operation of the 1-bit multiplier is similar to that of the generic m-bit 
multiplier example described earlier, however, using a 1-bit wide input format allows 
a novel optimisation of the circuit. 

Since the input data of the second operand must be either a 1 or a 0 then only 
one dynamic calculation is required as there are only 2 possible results, one of which 
is a null operation (i.e. multiplication by zero). The structure of the 1-bit multiplier 
varies from the m-bit multiplier in that the calculation unit only comprises a single
register to store the first operand B and a single adder. Parts corresponding to those of 
figure 1 are indicated by the same reference numerals increased by 100 and are not 
further described except insofar as they differ from their counterparts in figure 1.



The calculation unit 1, 2 shown in Figure 1 can be constructed in the 1-bit 
multiplier embodiment of figure 2 by using a register store for operand B and a single 
adder 102. The implementation of such a circuit is well known. When the 
multiplication operation is initiated the previous serial input bit is taken to be a zero. 
Once the current signal data input bit of the second operand A has been received it is 
used to determine whether the selected result is to be the result dynamically calculated 
by the adder (the sum of the first operand B and the partial
product in the register 103) or the previous partial result (i.e. no operation is 
performed). The final result is output via a parallel to serial converter 105. 
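
By way of illustration only, the 1-bit case may be modelled as follows. The sketch assumes unsigned operands, bits of the second operand A received least significant first, and an adder output formed from the stored operand B and the partial product; the function name and the alignment of B with the upper part of the register are assumptions of the sketch.

```python
def serial_multiply_1bit(b, a_bits):
    """Model of the 1-bit serial late-select multiplier.

    b      -- locally stored first operand B (non-negative integer)
    a_bits -- bits of operand A, least significant first (assumed order)
    """
    n = len(a_bits)
    partial = 0                              # contents of shift register 103
    for bit in a_bits:
        # Adder 102: the single dynamic calculation, performed while the
        # current bit is still being transmitted.
        summed = partial + (b << n)
        # Late select: the adder output for a 1, the unchanged partial
        # product (no operation) for a 0.
        selected = summed if bit else partial
        partial = selected >> 1              # shift to form the new partial product
    return partial
```

With B = 6 and A = 11 (bits 1, 1, 0, 1 least significant first) the partial product takes the values 48, 72, 36 and finally 66.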

Figure 3 shows a multiplier design for multiplication of 1-bit operands in 
two's complement format. The serially transmitted second operand A is decoded by a 
decoder 207 and the output provides instructions to an adder/subtractor 208 to choose 
the dynamic calculation operation i.e. either to add or to subtract the local operand B 
to or from the partial result that is fed back from the shift register 203. These add and
subtract instructions are decoded from the previous signal data input bit and allow the
dynamic calculation to be performed in parallel with the current signal data bit being
transmitted. When the multiplication operation is initiated the previous serial input bit
is taken to be a zero. Once the current signal data bit has been received and decoded it
is used to determine whether the selected result is to be the result dynamically
calculated by the multiplier or the previous partial result (i.e. no operation is
performed) according to the table shown in figure 4.
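
By way of illustration only, the two's complement operation may be modelled as follows. The table of figure 4 is not reproduced here, so the sketch assumes the conventional radix-2 Booth recoding, which matches the behaviour described: a previous bit of 0 prepares a subtraction, a previous bit of 1 prepares an addition, and the prepared result is selected only when the current bit differs from the previous bit. The function name, the bit order (least significant first) and the register alignment are assumptions of the sketch.

```python
def booth_serial_multiply(b, a_bits):
    """Model of the 1-bit two's complement late-select multiplier.

    b      -- locally stored first operand B (signed integer)
    a_bits -- two's complement bits of operand A, least significant first (assumed)
    """
    n = len(a_bits)
    partial = 0                      # contents of shift register 203
    previous = 0                     # previous bit is taken to be zero at the start
    for current in a_bits:
        # Decoder 207 and adder/subtractor 208: the operation depends only on
        # the previous bit, so it can run while the current bit is in transit.
        if previous:
            candidate = partial + (b << n)
        else:
            candidate = partial - (b << n)
        # Late select: keep the old partial product when the bits are equal.
        selected = candidate if current != previous else partial
        partial = selected >> 1      # arithmetic shift right (sign preserving)
        previous = current
    return partial


def to_bits(a, n):
    """n-bit two's complement representation of a, least significant first."""
    return [(a >> i) & 1 for i in range(n)]
```

With B = 5 and A = -3 in 4-bit form, `booth_serial_multiply(5, to_bits(-3, 4))` passes through the partial products -40, 20, -30 and returns -15.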

The timing diagram for a single cycle of the 1-bit two's complement multiplier 
operation is shown in Figure 5. The opening part of the clock cycle is available for the 
independent dynamic calculation of partial result(s) on the basis of the previously 
received data bit, and for the transmission of the current data bit. This is shown as
"Tmult" in Figure 5. The remaining part of the clock cycle is then dedicated to the late
select process that requires simple decoding of the current serial input data bit, which
may be easily constructed with simple logic gates to give very high performance. The
delay attributed to this process is shown as "Tselect" in Figure 5. Clearly, overlapping
the data transmission delay and the multiplier operation delay in this late select 
multiplier design offers greatly improved performance over traditional serial
multipliers. 
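
By way of illustration only, the effect of overlapping the two delays on the cycle time may be expressed with a simple model. The model and its function names are assumptions made for the sketch, and the delay values used in the example are arbitrary units chosen for illustration, not measured figures.

```python
def cycle_time_sequential(t_route, t_mult):
    """Cycle time when the calculation starts only after the data bit has
    arrived: the transmission and calculation delays add."""
    return t_route + t_mult


def cycle_time_late_select(t_route, t_mult, t_select):
    """Cycle time when the calculation overlaps the transmission: the longer
    of the two delays is followed by the short late-select delay."""
    return max(t_route, t_mult) + t_select
```

With hypothetical delays of 6 units for transmission, 5 units for the calculation and 1 unit for the late select, the sequential cycle takes 11 units whereas the late-select cycle takes 7 units.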

It will be appreciated that numerous modifications to the above described 
design may be made without departing from the scope of the invention as defined in 
the appended claims. For example, the shifting of the partial product stored in the 
shift register 3, 103, 203 may be performed by any equivalent operation such as 
modifying the connections to the register. The term "shifting" is used in the claims 
with the intention of incorporating such equivalent operations. 

CLAIMS

1. A serial binary multiplier for multiplying two binary operands to provide a 
final product, the multiplier comprising means for storing at least one first operand, a 
register for storing a partial product of the multiplication operation, means for 
receiving elements of a second operand serially, a calculation unit for calculating all 
possible results being the sum of the partial product and the product of the first 
operand with all possible values of the element of the second operand, said possible 
results being calculated during transmission of the second operand, means for 
selecting either one of the possible results or the currently stored partial product on the 
basis of the value of the received element of the second operand, means for shifting 
the partial product in the register to provide a new partial product, and means to 
output the contents of the register as the final product when all bits of the second 
operand have been received. 

2. A serial binary multiplier according to claim 1, wherein the elements of the 
second operand are m-bit words. 

3. A serial binary multiplier according to claim 2, wherein m=1 and the
calculation unit is an adder. 

4. A serial binary multiplier according to claim 1, 2 or 3, wherein the calculation 
unit calculates all possible results on the basis of the value of the first operand and the 
value of previously received elements of the second operand. 

5. A serial binary multiplier according to any preceding claim, wherein the 
means to output the contents of the register provides the final result in serial form. 

6. A serial binary multiplier according to any preceding claim, wherein the first 
and second operands and the final product are in two's complement form and the
possible results are calculated from the first operand, the partial product and the
previously received bit of the second operand. 

7. A serial binary multiplier according to claim 6, wherein the calculation unit is 
an adder and subtractor. 

8. A serial binary multiplier according to claim 7, wherein the calculation unit is 
a single circuit capable of addition and subtraction, the operation being determined by 
the value of the previously received bit. 

9. A method of operating a serial binary multiplier for multiplying two binary 
operands to provide a product comprising the steps of storing a first operand, storing a 
partial product in a register, transmitting elements of a second operand serially whilst 
simultaneously calculating all possible results being the sum of the partial product and 
the product of the first operand with all possible values of the element of the second 
operand, selecting either one of the possible results or the currently stored partial 
product on the basis of the value of the received element of the second operand, 
shifting the partial product in the register to provide a new partial product, and 
outputting the contents of the register as the final product when all bits of the second 
operand have been received. 

10. A serial binary multiplier substantially as hereinbefore described with 
reference to the accompanying drawings. 

11. A method of operating a serial binary multiplier substantially as hereinbefore 
described with reference to the accompanying drawings. 

ABSTRACT

A BINARY SERIAL MULTIPLIER 
A serial binary multiplier multiplies two operands to provide a product. A first 
operand is stored locally and a second operand is transmitted serially whilst 
simultaneously multiplying said first operand with all possible values of said second 
operand taking into account any received bits of the second operand. All possible 
results are added to the contents of a partial result register and stored and when a 
complete element of the full second operand has been received and decoded, the 
correct result is selected by the decoder. The new partial product is shifted in the 
register and when all the bits of the second operand have been received the final 
product is output to a parallel to serial converter. The method and circuit permit part
of the multiplication process to be performed whilst the input data is still being 
transmitted thereby reducing the operation delay. In a 1-bit two's complement 
embodiment a decoder is used to decide whether to add or subtract the locally stored
operand to or from the contents of the partial result register.
The decoder uses knowledge of the previously transmitted bit of the operand to make 
this decision.